Smart Paper Notes

PathFusion: Path-consistent Lidar-Camera Deep Feature Fusion

Lemeng Wu, Dilin Wang, Meng Li, Yunyang Xiong, Raghuraman Krishnamoorthi, Qiang Liu, Vikas Chandra

Category: Computer Vision

2022-12-12

Fusing camera with LiDAR is a promising technique to improve the accuracy of 3D detection because the two sensors have complementary physical properties. While most existing methods focus on fusing camera features directly with raw LiDAR point clouds or shallow 3D features, it is observed that direct deep 3D feature fusion achieves inferior accuracy due to feature misalignment. The misalignment, which originates from feature aggregation across large receptive fields, becomes increasingly severe in deep network stages. In this paper, we propose PathFusion to enable path-consistent LiDAR-camera deep feature fusion. PathFusion introduces a path consistency loss between shallow and deep features, which encourages the 2D backbone and its fusion path to transform 2D features in a way that is semantically aligned with the transform of the 3D backbone. We apply PathFusion to the prior-art fusion baseline, Focals Conv, and observe consistent improvements of more than 1.2% mAP on the nuScenes test split, both with and without test-time augmentations. Moreover, PathFusion also improves KITTI AP3D (R11) by more than 0.6% at the moderate level.

translated by Google Translate

Fast Point Cloud Generation with Straight Flows

Lemeng Wu, Dilin Wang, Chengyue Gong, Xingchao Liu, Yunyang Xiong, Rakesh Ranjan, Raghuraman Krishnamoorthi, Vikas Chandra, Qiang Liu

Category: Computer Vision

2022-12-04

Diffusion models have emerged as a powerful tool for point cloud generation. A key component that drives their impressive performance in generating high-quality samples from noise is iterative denoising over thousands of steps. While beneficial, the cost of these many learned steps has limited their use in many real-world 3D applications. To address this limitation, we propose Point Straight Flow (PSF), a model that exhibits impressive performance using one step. Our idea is based on a reformulation of the standard diffusion model, which optimizes the curvy learning trajectory into a straight path. Further, we develop a distillation strategy to shorten the straight path into one step without a performance loss, enabling real-world 3D applications with latency constraints. We perform evaluations on multiple 3D tasks and find that PSF performs comparably to the standard diffusion model while outperforming other efficient 3D point cloud generation methods. On real-world applications such as point cloud completion and training-free text-guided generation in a low-latency setup, PSF performs favorably.
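
The link between a straight path and one-step sampling can be illustrated with a toy example. The sketch below is not the PSF model: it assumes hand-written 2D velocity fields (`straight_velocity` and `curved_velocity` are hypothetical names) and a plain Euler integrator. It only shows that a single Euler step is exact when the velocity is constant along the path, whereas a curved trajectory needs many small steps.

```python
import math


def euler_integrate(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        v = velocity(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def straight_velocity(start, end):
    """Constant velocity field that moves `start` to `end` along a straight line."""
    diff = [e - s for s, e in zip(start, end)]
    return lambda x, t: diff


def curved_velocity(x, t):
    """Quarter-turn rotation over t in [0, 1]: a deliberately curved trajectory."""
    w = math.pi / 2
    return [-w * x[1], w * x[0]]


def l2_error(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


start, end = [1.0, 0.0], [0.0, 1.0]  # both paths go from `start` to `end`

straight_one = euler_integrate(straight_velocity(start, end), start, 1)
curved_one = euler_integrate(curved_velocity, start, 1)
curved_many = euler_integrate(curved_velocity, start, 1000)

print("straight path, 1 step, error:", l2_error(straight_one, end))
print("curved path, 1 step, error:", l2_error(curved_one, end))
print("curved path, 1000 steps, error:", l2_error(curved_many, end))
```

In this toy setting, the straight path reaches the target in one step, while the curved path only gets close after many steps.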

LiCo-Net: Linearized Convolution Network for Hardware-efficient Keyword Spotting

Haichuan Yang, Zhaojun Yang, Li Wan, Biqiao Zhang, Yangyang Shi, Yiteng Huang, Ivaylo Enchev, Limin Tang, Raziel Alvarez, Ming Sun

Category: Machine Learning | Artificial Intelligence

2022-11-09

This paper proposes a hardware-efficient architecture, Linearized Convolution Network (LiCo-Net), for keyword spotting. It is optimized specifically for low-power processor units like microcontrollers. ML operators exhibit heterogeneous efficiency profiles on power-efficient hardware. For the same theoretical computation cost, int8 operators are more computation-effective than float operators, and linear layers are often more efficient than other layers. The proposed LiCo-Net is a dual-phase system that uses the efficient int8 linear operators at the inference phase and applies streaming convolutions at the training phase to maintain a high model capacity. The experimental results show that LiCo-Net outperforms the singular value decomposition filter (SVDF) on hardware efficiency with on-par detection performance. Compared to SVDF, LiCo-Net reduces cycles by 40% on HiFi4 DSP.
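
The idea of swapping a convolution for a linear operator at inference rests on a general fact: a 1D convolution over time is a linear map applied to a sliding window of frames. The sketch below is not the LiCo-Net architecture and omits int8 quantization. It assumes a hypothetical kernel layout `kernel[k][c_in][c_out]` and illustrative helper names, and only checks that a streaming linear layer over the last K frames reproduces the convolution output.

```python
def conv1d(frames, kernel):
    """Valid 1D convolution (cross-correlation) over time.

    frames: list of T frames, each a list of C_in floats.
    kernel: kernel[k][c_in][c_out] with K taps (assumed layout).
    """
    num_taps = len(kernel)
    c_out = len(kernel[0][0])
    out = []
    for t in range(len(frames) - num_taps + 1):
        y = [0.0] * c_out
        for k in range(num_taps):
            for c, x in enumerate(frames[t + k]):
                for o in range(c_out):
                    y[o] += x * kernel[k][c][o]
        out.append(y)
    return out


def kernel_to_linear(kernel):
    """Flatten kernel[k][c_in][c_out] into a (K * C_in, C_out) weight matrix."""
    return [row for tap in kernel for row in tap]


def linear(vec, weight):
    """Plain linear layer without bias: vec @ weight."""
    c_out = len(weight[0])
    return [sum(v * weight[i][o] for i, v in enumerate(vec)) for o in range(c_out)]


def streaming_linear(frames, weight, num_taps):
    """Feed frames one at a time; apply the linear layer to the last K frames."""
    buffer, out = [], []
    for frame in frames:
        buffer.append(frame)
        if len(buffer) > num_taps:
            buffer.pop(0)  # drop the oldest frame
        if len(buffer) == num_taps:
            flat = [x for f in buffer for x in f]
            out.append(linear(flat, weight))
    return out


# Toy input: 4 frames with 2 channels, kernel with 2 taps, 2 in and 2 out channels.
frames = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0], [2.0, 2.0]]
kernel = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, -1.0], [1.0, 1.0]],
]

conv_out = conv1d(frames, kernel)
linear_out = streaming_linear(frames, kernel_to_linear(kernel), num_taps=2)
print(conv_out == linear_out)
```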

Learning a Dual-Mode Speech Recognition Model via Self-Pruning

Chunxi Liu, Yuan Shangguan, Haichuan Yang, Yangyang Shi, Raghuraman Krishnamoorthi, Ozlem Kalinli

Category: Natural Language Processing

2022-07-25

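There is growing interest in unifying streaming and full-context automatic speech recognition (ASR) networks into a single end-to-end ASR model, to simplify model training and deployment for both use cases. In real-world ASR applications, however, streaming ASR models typically operate under tighter storage and computational constraints (e.g., on embedded devices) than any server-side full-context model. Motivated by recent progress in Omni-sparsity supernet training, where multiple subnetworks are jointly optimized in one single model, this work aims to jointly learn a compact sparse on-device streaming ASR model and a large dense server non-streaming model in a single supernet. We further show that performing supernet training on both wav2vec 2.0 self-supervised learning and supervised ASR fine-tuning not only substantially improves the large non-streaming model, as shown in prior work, but also improves the compact sparse streaming model.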

DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks

Yonggan Fu, Haichuan Yang, Jiayi Yuan, Meng Li, Cheng Wan, Raghuraman Krishnamoorthi, Vikas Chandra, Yingyan Lin

Category: Machine Learning | Computer Vision

2022-06-02

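Efficient deep neural network (DNN) models equipped with compact operators (e.g., depthwise convolutions) have shown great potential in reducing the theoretical complexity of DNNs (e.g., the total number of weights or operations) while maintaining decent model accuracy. However, existing efficient DNNs are still limited in fulfilling their promise of boosting real-hardware efficiency, because their commonly adopted compact operators have low hardware utilization. In this work, we open up a new compression paradigm for developing real-hardware-efficient DNNs, boosting hardware efficiency while maintaining model accuracy. Interestingly, we observe that while the activation functions of some DNN layers help training optimization and achievable accuracy, they can be properly removed after training without compromising model accuracy. Inspired by this observation, we propose a framework called DepthShrinker, which develops hardware-friendly compact networks by shrinking the basic building blocks of existing efficient DNNs, which feature irregular computation patterns, into dense ones with much improved hardware utilization and thus real-hardware efficiency. Excitingly, our DepthShrinker framework delivers hardware-friendly compact networks that outperform both state-of-the-art efficient DNNs and compression techniques. Our code is available at: https://github.com/facebookresearch/depthshrinker.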
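
Why removing an activation function allows a block to shrink can be seen from basic linear algebra: two consecutive linear layers with nothing between them collapse into a single dense layer, since W2(W1 x) = (W2 W1) x. The sketch below is not the DepthShrinker code. It uses small hand-picked matrices and illustrative helper names purely to check that identity, and to show that it fails while a ReLU sits between the layers.

```python
def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply_layer(weight, x):
    """Linear layer without bias: weight @ x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def relu(v):
    return [max(0.0, a) for a in v]


# Toy weights chosen for illustration only.
w1 = [[1.0, -2.0], [0.0, 1.0], [3.0, 1.0]]   # maps 2 -> 3 features
w2 = [[1.0, 0.0, -1.0], [2.0, 1.0, 0.0]]     # maps 3 -> 2 features
x = [1.0, 2.0]

two_layers = apply_layer(w2, apply_layer(w1, x))       # activation removed
merged = matmul(w2, w1)                                # single 2 -> 2 dense layer
one_layer = apply_layer(merged, x)
with_relu = apply_layer(w2, relu(apply_layer(w1, x)))  # activation kept

print(two_layers == one_layer)  # merging is exact without the activation
print(with_relu == one_layer)   # the ReLU blocks the merge
```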

Influence of Mobility Restrictions on Transmission of COVID-19 in the state of Maryland -- the USA

Nandini Raghuraman, Kartik Kaushik

Category: Machine Learning

2021-09-24

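Background: The coronavirus disease COVID-19 was first detected in the United States in 2020. To curb the spread of the disease, different states issued mandatory stay-at-home (SAH) orders in mid-March. These non-pharmaceutical interventions were mandated based on prior experience, such as the 1918 influenza epidemic. We therefore decided to study the impact of mobility restrictions on reducing COVID-19 transmission. Methods: We designed an ecological time-series study with mobility patterns in the state of Maryland, beginning in March 2020, as the exposure variable and COVID-19 hospitalizations over the same period as the outcome variable. We built an Extreme Gradient Boosting (XGBoost) ensemble machine learning model and regressed lagged COVID-19 hospitalizations on mobility volume for different regions of Maryland. Results: We found an 18% increase in COVID-19 hospitalizations when mobility increased by a factor of five, and a 43% increase when mobility increased further, by a factor of ten. Conclusion: Our findings indicate a positive linear relationship between mobility and the incidence of COVID-19 cases. These findings are consistent with other studies that point to the benefits of mobility restrictions. However, a more detailed approach is needed to understand precisely the benefits and limitations of mobility restrictions as part of a response to the COVID-19 pandemic.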
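
Regressing lagged hospitalizations on mobility requires aligning each day's mobility with hospitalizations some days later. The sketch below shows only that alignment step, under stated assumptions: the helper `lagged_pairs` is hypothetical, the lag of two days is arbitrary (the abstract does not state the lag used), and the numbers are made-up placeholders, not study data. The resulting pairs would then be passed to a regressor such as XGBoost.

```python
def lagged_pairs(mobility, hospitalizations, lag_days):
    """Pair mobility on day t with hospitalizations on day t + lag_days."""
    if lag_days < 0:
        raise ValueError("lag_days must be non-negative")
    usable = min(len(mobility), len(hospitalizations)) - lag_days
    return [
        (mobility[t], hospitalizations[t + lag_days])
        for t in range(max(usable, 0))
    ]


# Placeholder daily series (synthetic, for illustration only).
mobility = [100, 120, 90, 150, 130]
hospitalizations = [5, 6, 7, 6, 9]

pairs = lagged_pairs(mobility, hospitalizations, lag_days=2)
for exposure, outcome in pairs:
    print(exposure, outcome)
```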